R Notebooks are a ‘flavor’ of R markdown that combine plain text with R commands in code chunks. You can (and should!) edit them, and save it every few minutes like anything else!
Type and run R commands in the code chunks (ctrl + enter will run the current line). The output will appear below the code chunk. If you’re in RStudio, you can reduce the height of the console but don’t minimize it completely because messages, progress bars, appear there.
Click the ‘Preview’ button in the toolbar to see an HTML preview what you’ve run so far.
Keyboard shortcuts:
run the current line of R: ctrl + enter
run everything in the current code chunk: ctrl + shift + enter
insert a new code chunk: ctrl + alt + i
If you haven’t already installed all the packages needed for the workshop, run this install script: https://bit.ly/36Z90iH
First we load the packages we need for this exercise, which include caladaptr and several packages from the tidyverse:
library(caladaptr)
library(sf)
library(dplyr)
library(lubridate)
library(units)
library(ggplot2)
library(tidyr)
library(tmap)
library(stringr)
dplyr in particular has a number of very generic function names, so we tell R which one we want it to use using the conflicted package:
library(conflicted)
conflict_prefer("filter", "dplyr", quiet = TRUE)
conflict_prefer("count", "dplyr", quiet = TRUE)
conflict_prefer("select", "dplyr", quiet = TRUE)
In this exercise we’ll explore the city of Merced’s climate - past and future.
The Cal-Adapt server has almost a dozen boundary layers you can use to specify a location when retrieving data. caladaptr refers to these polygon areas as “areas-of-interest presets”. You can see the list of these by viewing the constant aoipreset_types:
aoipreset_types
[1] "censustracts" "counties" "cdistricts" "ccc4aregions" "climregions" "hydrounits"
[7] "irwm" "electricutilities" "wecc-load-area" "evtlocations" "place"
The place preset contains the boundaries for cities.
## Download the place layer as a simple feature data frame
places_sf <- ca_aoipreset_geom("place")
Reading layer `place' from data source `C:\Users\Andy\AppData\Local\R\cache\R\caladaptr\place.gpkg' using driver `GPKG'
Simple feature collection with 1522 features and 10 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -13833610 ymin: 3833633 xmax: -12715920 ymax: 5159971
projected CRS: WGS 84 / Pseudo-Mercator
To plot the places layer:
library(tmap)
tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(places_sf) +
tm_polygons()
There are two ways to find the geoid value for Merced:
click on the map until you find it
search the attribute table:
places_sf %>%
st_drop_geometry() %>%
filter(stringr::str_detect(name, "Merced"))
It looks like “0646898” is the geoid value for the city of Merced. Let’s plot it by itself to make sure:
my_geoid <- "0646898"
tm_shape(places_sf %>% filter(geoid == my_geoid)) +
tm_polygons(col = "red", alpha = 0.2)
Yes that looks like Merced.
Find the geodid of your home city.
Plot it.
Challenge: get the longitude and latitude of your home city. Solution
## geodid of my home city:
Cal-Adapt has two kinds of historic data - ‘observed’ and modeled. For better comparability with future projections, we’ll look at modeled historic data, which are generated by the same GCMs that make the future projections.
The first thing we need to do is to decide which of the 10 available GCMs we want to use. We’ll pick the first 4 which have been identified as the priority models for California:
## View all the GCMs
gcms
[1] "HadGEM2-ES" "CNRM-CM5" "CanESM2" "MIROC5" "ACCESS1-0" "CCSM4" "CESM1-BGC" "CMCC-CMS" "GFDL-CM3" "HadGEM2-CC"
[11] "ens32avg" "ens32max" "ens32min"
## Save the first four
my_gcms <- gcms[1:4]
my_gcms
[1] "HadGEM2-ES" "CNRM-CM5" "CanESM2" "MIROC5"
Next, we need to identify which emissions scenario we want to use in our request:
## Available emissions scenarios
scenarios
[1] "rcp45" "rcp85" "historical"
Other decisions we have to make include:
Do we want annual, monthly or daily data?
Do we want the minimum or maximum temperature for each time period?
What date range are we interested in?
How do we want to aggregate the values of the grid cells that cover Merced?
Note: Cal-Adapt does not have data for all combinations of parameters. In a future lesson, we’ll see how to look at the Cal-Adapt API data catalog to see which datasets are available, but for now we’ll make these decisions:
period: yearly
climate variable: maximum temperature (in this case for each year)
Date rate: 1975 - 2005 (30 years)
Spatial aggregation function: mean
We now have all the pieces to build an API request:
mercd_hist_cap <- ca_loc_aoipreset(type = "place", idfld = "geoid", idval = my_geoid) %>%
ca_period("year") %>%
ca_gcm(my_gcms) %>%
ca_scenario("historical") %>%
ca_cvar(c("tasmin", "tasmax")) %>%
ca_years(start = 1975, end = 2005) %>%
ca_options(spatial_ag = "mean")
mercd_hist_cap
Cal-Adapt API Request
Location(s):
AOI Preset: place
geoid(s): 0646898
Variable(s): tasmin, tasmax
Temporal aggregration period(s): year
GCM(s): HadGEM2-ES, CNRM-CM5, CanESM2, MIROC5
Scenario(s): historical
Slug(s): NA
Dates: 1975-01-01 to 2005-12-31
Options:
spatial ag: mean
temporal ag (add'l): NA
We can see how many LOCA grid cells cover Merced by plotting the API request and adding the locagrid = TRUE argument.
plot(mercd_hist_cap, locagrid = TRUE)
NOTE: just because you build an API request doesn’t mean Cal-Adapt has a matching data layer. API requests are evaluated when you try to fetch data; you can also check the raster series catalog. If a matching data layer doesn’t exist an error will be returned.
Now we’re ready to fetch data:
mercd_hist_tbl <- mercd_hist_cap %>% ca_getvals_tbl()
|
| | 0%
|
|================== | 14%
|
|===================================== | 29%
|
|======================================================= | 43%
|
|========================================================================= | 57%
|
|=========================================================================================== | 71%
|
|============================================================================================================== | 86%
|
|================================================================================================================================| 100%
## backup: mercd_hist_tbl <- readRDS("data/mercd_hist_tbl.rds")
mercd_hist_tbl
This looks pretty good, but the temperature column is in degrees Kelvin. We can use the units package to add a column in Fahrenheit.
mercd_hist_tbl2 <- mercd_hist_tbl %>%
mutate(temp_f = set_units(val, degF), year = year(dt))
head(mercd_hist_tbl2)
Plot the time series:
ggplot(data = mercd_hist_tbl2 %>% filter(cvar == "tasmax"),
aes(x = year, y = as.numeric(temp_f))) +
geom_line(aes(color=gcm)) +
labs(title = "Annual Maximum Temperature for Merced", x = "year", y = "temp (F)",
subtitle = "Modelled historic data, 1975-2005")
Make a plot of the modeled minimum annual temperature for Merced from 1975-2005. Solution
We can compute the mean modelled annual temp by grouping the rows by year and GCM and taking the mean of temp_f. This assumes the temperature distribution is evenly spread between the minimum and maximum temperature.
mercd_hist_mean_tbl <- mercd_hist_tbl2 %>%
group_by(year, gcm) %>%
summarise(mean_temp = mean(temp_f))
`summarise()` regrouping output by 'year' (override with `.groups` argument)
mercd_hist_mean_tbl
Plot it:
ggplot(data = mercd_hist_mean_tbl,
aes(x = year, y = as.numeric(mean_temp))) +
geom_line(aes(color=gcm)) +
labs(title = "Annual Mean Temperature for Merced", x = "year", y = "temp (F)",
subtitle = "Modelled historic data, 1975-2005")
Next let’s look at projected future temperature (2020-2099)
mercd_prj_cap <- ca_loc_aoipreset(type = "place", idfld = "geoid", idval = my_geoid) %>%
ca_period("year") %>%
ca_gcm(my_gcms) %>%
ca_scenario(c("rcp45", "rcp85")) %>%
ca_cvar(c("tasmax", "tasmin")) %>%
ca_years(start = 2020, end = 2099) %>%
ca_options(spatial_ag = "mean")
mercd_prj_cap
Cal-Adapt API Request
Location(s):
AOI Preset: place
geoid(s): 0646898
Variable(s): tasmax, tasmin
Temporal aggregration period(s): year
GCM(s): HadGEM2-ES, CNRM-CM5, CanESM2, MIROC5
Scenario(s): rcp45, rcp85
Slug(s): NA
Dates: 2020-01-01 to 2099-12-31
Options:
spatial ag: mean
temporal ag (add'l): NA
Here we i) fetch data, ii) add columns for Fahrenheit and year, and iii) return a subset of columns:
mercd_prj_tbl <- mercd_prj_cap %>%
ca_getvals_tbl() %>%
mutate(temp_f = set_units(val, degF), year = year(dt)) %>%
select(year, cvar, scenario, gcm, temp_f)
|
| | 0%
|
|========= | 7%
|
|================= | 13%
|
|========================== | 20%
|
|================================== | 27%
|
|=========================================== | 33%
|
|=================================================== | 40%
|
|============================================================ | 47%
|
|==================================================================== | 53%
|
|============================================================================= | 60%
|
|===================================================================================== | 67%
|
|============================================================================================== | 73%
|
|====================================================================================================== | 80%
|
|=============================================================================================================== | 87%
|
|======================================================================================================================= | 93%
|
|================================================================================================================================| 100%
## backup: mercd_prj_tbl <- readRDS("data/mercd_prj_tbl.rds")
dim(mercd_prj_tbl)
[1] 1280 5
head(mercd_prj_tbl)
Our tibble mercd_prj_tbl has values for 80 years, 2 climate variables (tasmin and tasmax), 2 RCPs, and 4 GCMs. There are many ways we can slice and dice this.
Let’s plot the mean annual temperature over time for each emissions scenario, keeping the GCMs as separate series. Start by computing the mean annual projected temp as the average of tasmin and tasmax:
mercd_prj_mean_tbl <- mercd_prj_tbl %>%
group_by(year, gcm, scenario) %>%
summarise(mean_temp = mean(temp_f))
`summarise()` regrouping output by 'year', 'gcm' (override with `.groups` argument)
mercd_prj_mean_tbl
Now make the plot, with each RCP in its own panel or facet:
ggplot(data = mercd_prj_mean_tbl,
aes(x = year, y = as.numeric(mean_temp))) +
geom_line(aes(color=gcm)) +
facet_grid(scenario ~ .) +
labs(title = "Annual Mean Temperature for Merced", x = "year", y = "temp (F)",
subtitle = "2020-2099")
Take the average of the mean temperature of all the GCMs so there’s only one mean annual temperature per RCP.
Plot both RCPs on the same plot.
mercd_prj_mean2_tbl <- mercd_prj_mean_tbl %>%
group_by(year, scenario) %>%
summarise(mean_all_gcms = mean(mean_temp))
`summarise()` regrouping output by 'year' (override with `.groups` argument)
mercd_prj_mean2_tbl
ggplot(data = mercd_prj_mean2_tbl,
aes(x = year, y = as.numeric(mean_all_gcms))) +
geom_line(aes(color = scenario)) +
labs(title = "Annual Mean Temperature for Merced", x = "year", y = "temp (F)",
subtitle = "Average of 4 GCMs. 2020-2099")